AITopics | monte carlo evaluation

Collaborating Authors

monte carlo evaluation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Monte Carlo (MC) methods are the most widely used methods to estimate the performance of a policy. Given an interested policy, MC methods give estimates by repeatedly running this policy to collect samples and taking the average of the outcomes. Samples collected during this process are called online samples. To get an accurate estimate, MC methods consume massive online samples. When online samples are expensive, e.g., online recommendations and inventory management, we want to reduce the number of online samples while achieving the same estimate accuracy. To this end, we use off-policy MC methods that evaluate the interested policy by running a different policy called behavior policy. We design a tailored behavior policy such that the variance of the off-policy MC estimator is provably smaller than the ordinary MC estimator. Importantly, this tailored behavior policy can be efficiently learned from existing offline data, i,e., previously logged data, which are much cheaper than online samples. With reduced variance, our off-policy MC method requires fewer online samples to evaluate the performance of a policy compared with the ordinary MC method. Moreover, our off-policy MC estimator is always unbiased.

machine learning, pdis, reinforcement learning, (12 more...)

arXiv.org Artificial Intelligence

2301.13734

Country:

North America > Canada > Alberta (0.14)
North America > United States > Virginia > Albemarle County > Charlottesville (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
(6 more...)

Genre: Research Report (0.63)

Industry: Leisure & Entertainment > Games (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)

Add feedback

On the overestimation of widely applicable Bayesian information criterion

Imai, Toru

arXiv.org Machine LearningAug-28-2019

A widely applicable Bayesian information criterion (Watanabe, 2013) is applicable for both regular and singular models in the model selection problem. This criterion tends to overestimate the log marginal likelihood. We identify an overestimating term of a widely applicable Bayesian information criterion. Adjustment of the term gives an asymptotically unbiased estimator of the leading two terms of asymptotic expansion of the log marginal likelihood. In numerical experiments on regular and singular models, the adjustment resulted in smaller bias than the original criterion.

artificial intelligence, bayesian inference, machine learning, (11 more...)

arXiv.org Machine Learning

1908.10572

Country:

Asia > Japan (0.29)
North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

Monte Carlo Methods for the Game Kingdomino

Gedda, Magnus, Lagerkvist, Mikael Z., Butler, Martin

arXiv.org Artificial IntelligenceJul-15-2018

Kingdomino is introduced as an interesting game for studying game playing: the game is multiplayer (4 independent players per game); it has a limited game depth (13 moves per player); and it has limited but not insignificant interaction among players. Several strategies based on locally greedy players, Monte Carlo Evaluation (MCE), and Monte Carlo Tree Search (MCTS) are presented with variants. We examine a variation of UCT called progressive win bias and a playout policy (Player-greedy) focused on selecting good moves for the player. A thorough evaluation is done showing how the strategies perform and how to choose parameters given specific time constraints. The evaluation shows that surprisingly MCE is stronger than MCTS for a game like Kingdomino. All experiments use a cloud-native design, with a game server in a Docker container, and agents communicating using a REST-style JSON protocol. This enables a multi-language approach to separating the game state, the strategy implementations, and the coordination layer.

artificial intelligence, machine learning, opponent, (16 more...)

arXiv.org Artificial Intelligence

1807.04458

Country: